The Cramer Distance as a Solution to Biased Wasserstein Gradients

Authors

  • Marc G. Bellemare
  • Ivo Danihelka
  • Will Dabney
  • Shakir Mohamed
  • Balaji Lakshminarayanan
  • Stephan Hoyer
  • Rémi Munos
Abstract

The Wasserstein probability metric has received much attention from the machine learning community. Unlike the Kullback-Leibler divergence, which strictly measures change in probability, the Wasserstein metric reflects the underlying geometry between outcomes. The value of being sensitive to this geometry has been demonstrated, among others, in ordinal regression and generative modelling. In this paper we describe three natural properties of probability divergences that reflect requirements from machine learning: sum invariance, scale sensitivity, and unbiased sample gradients. The Wasserstein metric possesses the first two properties but, unlike the Kullback-Leibler divergence, does not possess the third. We provide empirical evidence suggesting that this is a serious issue in practice. Leveraging insights from probabilistic forecasting we propose an alternative to the Wasserstein metric, the Cramér distance. We show that the Cramér distance possesses all three desired properties, combining the best of the Wasserstein and Kullback-Leibler divergences. To illustrate the relevance of the Cramér distance in practice we design a new algorithm, the Cramér Generative Adversarial Network (GAN), and show that it performs significantly better than the related Wasserstein GAN.
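
To make the two quantities compared in the abstract concrete, the following is a minimal sketch and not the authors' implementation. It assumes two discrete distributions given as probability vectors on a shared, equally spaced one-dimensional support, and uses the standard one-dimensional facts that the Wasserstein-1 distance is the integral of the absolute difference between the two CDFs, while the squared Cramér distance is the integral of the squared difference. The function name `cdf_distances` and the `spacing` parameter are hypothetical, chosen only for illustration.

```python
import numpy as np

def cdf_distances(p, q, spacing=1.0):
    """Return (Wasserstein-1, squared Cramer) for two discrete distributions.

    Assumption: p and q are probability vectors over the same equally
    spaced 1-D support, with `spacing` the gap between adjacent points.
    """
    p = np.asarray(p, dtype=float)
    q = np.asarray(q, dtype=float)
    # Difference of the two cumulative distribution functions at each point.
    cdf_diff = np.cumsum(p) - np.cumsum(q)
    # Wasserstein-1 in 1-D: integral of |F_P - F_Q|.
    wasserstein_1 = spacing * np.sum(np.abs(cdf_diff))
    # Squared Cramer distance: integral of (F_P - F_Q)^2.
    cramer_squared = spacing * np.sum(cdf_diff ** 2)
    return wasserstein_1, cramer_squared

# Two point masses at opposite ends of a three-point support.
w1, c2 = cdf_distances([1, 0, 0], [0, 0, 1])
```

For the two point masses above, both values equal 2. Moving the masses closer together shrinks both quantities, which is the sensitivity to the geometry of outcomes that the Kullback-Leibler divergence lacks.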

Related resources

Demystifying MMD GANs

We investigate the training and performance of generative adversarial networks using the Maximum Mean Discrepancy (MMD) as critic, termed MMD GANs. As our main theoretical contribution, we clarify the situation with bias in GAN loss functions raised by recent work: we show that gradient estimators used in the optimization process for both MMD GANs and Wasserstein GANs are unbiased, but learning...

Limit Distribution of Distances in Biased Random Tries

The trie is a sort of digital tree. Ideally, to achieve balance, the trie should grow from an unbiased source generating keys of bits with equal likelihoods. In practice, the lack of bias is not always guaranteed. We investigate the distance between randomly selected pairs of nodes among the keys in a biased trie. This research complements that of Christophi and Mahmoud (2005); however, the res...

Generalized Wasserstein distance and its application to transport equations with source

In this article, we generalize the Wasserstein distance to measures with different masses. We study the properties of such distance. In particular, we show that it metrizes weak convergence for tight sequences. We use this generalized Wasserstein distance to study a transport equation with source, in which both the vector field and the source depend on the measure itself. We prove existence and un...

Distribution of Inter-Node Distances in Digital Trees

We investigate distances between pairs of nodes in digital trees (digital search trees (DST), and tries). By analytic techniques, such as the Mellin Transform and poissonization, we describe a program to determine the moments of these distances. The program is illustrated on the mean and variance. One encounters delayed Mellin transform equations, which we solve by inspection. Interestingly, th...

On Wasserstein Reinforcement Learning and the Fokker-Planck equation

Policy gradient methods often achieve better performance when the change in policy is limited to a small Kullback-Leibler divergence. We derive policy gradients where the change in policy is limited to a small Wasserstein distance (or trust region). This is done in the discrete and continuous multi-armed bandit settings with entropy regularisation. We show that in the small steps limit with re...

Journal:
  • CoRR

Volume abs/1705.10743

Publication date: 2017